Gammatone auditory filterbank and independent component analysis for speaker identification
Abstract
Feature extraction is the key step in robust speaker identification. The most commonly used feature extraction techniques work well only in clean or matched environments. Accurate speaker identification is hampered by a number of factors, of which handset/channel mismatch and environmental noise are the most prominent. This paper presents a novel technique based on a Gammatone filterbank (GTF) and independent component analysis (ICA). The method first uses the Gammatone filterbank to emulate the frequency resolution of the human cochlea. ICA then extracts the dominant components from these frequency bands. The extracted features emphasize the differences in statistical structure among speakers and can therefore model the distribution of each individual. Compared with commonly used techniques such as linear predictive cepstral coefficients (LPCC), Mel-frequency cepstral coefficients (MFCC) and perceptual linear prediction (PLP), the proposed method is more robust to additive noise and yields a higher recognition rate in mismatched environments in a text-independent speaker identification system.
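The two stages named in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract gives no parameters, so the fourth-order gammatone impulse response, the bandwidth factor 1.019, the Glasberg and Moore ERB formula, the frame and hop lengths, the use of log band energies as the ICA input, and the tanh-based symmetric FastICA are all assumptions, and every function name is hypothetical.

```python
import numpy as np

def erb(fc):
    """Equivalent rectangular bandwidth in Hz (Glasberg and Moore formula)."""
    return 24.7 * (4.37 * fc / 1000.0 + 1.0)

def erb_space(f_low, f_high, n):
    """Return n centre frequencies equally spaced on the ERB-rate scale."""
    lo = 21.4 * np.log10(4.37 * f_low / 1000.0 + 1.0)
    hi = 21.4 * np.log10(4.37 * f_high / 1000.0 + 1.0)
    return (10.0 ** (np.linspace(lo, hi, n) / 21.4) - 1.0) * 1000.0 / 4.37

def gammatone_ir(fc, fs, duration=0.05, order=4, b=1.019):
    """Truncated gammatone impulse response, normalised to unit energy."""
    t = np.arange(int(duration * fs)) / fs
    g = t ** (order - 1) * np.exp(-2 * np.pi * b * erb(fc) * t) * np.cos(2 * np.pi * fc * t)
    return g / np.sqrt(np.sum(g ** 2))

def gammatone_log_energies(signal, fs, centre_freqs, frame_len=400, hop=160):
    """Filter the signal in each band, then take the log energy per frame."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    feats = np.empty((n_frames, len(centre_freqs)))
    for k, fc in enumerate(centre_freqs):
        band = np.convolve(signal, gammatone_ir(fc, fs), mode="same")
        for i in range(n_frames):
            frame = band[i * hop:i * hop + frame_len]
            feats[i, k] = np.log(np.mean(frame ** 2) + 1e-12)
    return feats

def sym_decorrelate(W):
    """Symmetric decorrelation: W <- (W W^T)^(-1/2) W."""
    s, u = np.linalg.eigh(W @ W.T)
    return u @ np.diag(1.0 / np.sqrt(s)) @ u.T @ W

def fast_ica(X, n_components, n_iter=200, tol=1e-6, seed=0):
    """Symmetric FastICA (tanh nonlinearity) on X of shape (n_samples, n_features).

    Returns (sources, unmixing) with sources = (X - mean) @ unmixing.T.
    """
    Xc = X - X.mean(axis=0)
    # Whiten with the leading eigenvectors of the covariance matrix.
    eigval, eigvec = np.linalg.eigh(np.cov(Xc, rowvar=False))
    idx = np.argsort(eigval)[::-1][:n_components]
    K = eigvec[:, idx] / np.sqrt(eigval[idx])
    Z = Xc @ K
    rng = np.random.default_rng(seed)
    W = sym_decorrelate(rng.standard_normal((n_components, n_components)))
    for _ in range(n_iter):
        G = np.tanh(Z @ W.T)
        # Fixed-point update: E[z g(w^T z)] - E[g'(w^T z)] w, for every row w.
        W_new = (G.T @ Z) / len(Z) - np.diag((1.0 - G ** 2).mean(axis=0)) @ W
        W_new = sym_decorrelate(W_new)
        converged = np.max(np.abs(np.abs(np.sum(W_new * W, axis=1)) - 1.0)) < tol
        W = W_new
        if converged:
            break
    return Z @ W.T, W @ K.T
```

For a one-second signal `x` sampled at 16 kHz, `gammatone_log_energies(x, 16000, erb_space(100.0, 4000.0, 8))` would give one eight-dimensional vector per 10 ms hop, and `fast_ica(feats, 4)` would reduce those vectors to four statistically independent components. In a full system the unmixing matrix would presumably be estimated on training speech and the resulting features passed to a speaker model; that stage is outside this sketch.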
Similar resources
Voice biometric feature using Gammatone filterbank and ICA
Voice biometric feature extraction is the core task in developing any speaker identification system. This paper proposes a robust feature extraction technique for speaker identification. The technique processes a monaural speech signal with a Gammatone filterbank (GTF) modeled on the human auditory system, followed by independent component analysis (ICA). The measures used to assess the ...
Gammatone Auditory Filterbank and Independent Component Analysis for Speaker Identification Systems
ABSTRACT Speaker identification is the process of recognizing who is speaking on the basis of information extracted from the speech signal. It has a number of applications in security and voice-controlled services. However, the most commonly used speaker recognition techniques work successfully only in clean or matched environments. Accurate speaker identification is made difficult due to a n...
Spectro-temporal features for robust far-field speaker identification
Features derived from an auditory spectro-temporal representation of speech are proposed for robust far-field speaker identification. The auditory representation is obtained by first filtering the speech signal with a gammatone filterbank. A modulation filterbank is then applied to the temporal envelope of each gammatone filter output. Compared to commonly used mel-frequency cepstral coefficien...
Investigating the use of a Gammatone filterbank for a cochlear implant coding strategy
BACKGROUND Contemporary speech processing strategies in cochlear implants (CIs) such as the Advanced Combination Encoder (ACE) use a standard Fast Fourier Transform (FFT) filterbank to extract envelopes. The assignment of the FFT bins to approximate the frequency resolution of the basilar membrane is only partly based on physiology, especially since the bins are distributed linearly below 1000H...
An Efficient Auditory Filterbank Based on the Gammatone Function
This paper describes the development of an auditory filterbank to perform the initial frequency analysis in models of human hearing and speech perception. It is based on the gammatone function used by physiologists to summarise 'revcor' measurements of the impulse response of the auditory filter in small mammals. The first section shows that the amplitude characteristic of the gammatone functio...